Butterfly: Privacy Preserving Publishing on Multiple Quasi-Identifiers
Authors
Abstract
Privacy preserving data publishing has recently attracted significant research interest. Most existing studies consider only the case where the data is published using a single quasi-identifier. In a few important applications, however, a data set must be published on multiple quasi-identifiers for multiple users simultaneously, which poses several challenges. How can we generate one anonymized version of the data so that a privacy preservation requirement such as k-anonymity is satisfied for all users? Moreover, how can we reduce the information loss as much as possible while the privacy preservation requirements are met? In this paper, we identify and tackle the novel problem of privacy preserving publishing on multiple quasi-identifiers. A naïve solution that publishes a separate version for each quasi-identifier unfortunately suffers from the possibility that those releases can be joined to breach privacy. Interestingly, we show that it is possible to generate only one anonymized table that satisfies k-anonymity on all quasi-identifiers for all users without significant information loss. We systematically develop an effective method for privacy preserving publishing for multiple users, and report an empirical study using real data to verify the feasibility and effectiveness of our method.
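The requirement stated in the abstract, k-anonymity holding on every quasi-identifier at once, can be illustrated with a small check. The following is a minimal sketch in Python and is not the method developed in the paper: it only tests whether a given table meets the requirement and does not perform any anonymization. The function names, the attribute names (`age`, `zip`, `sex`), and the toy rows are all hypothetical and chosen for illustration.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifier, k):
    """Return True if every value combination on the attributes of
    `quasi_identifier` is shared by at least k rows."""
    groups = Counter(
        tuple(row[attr] for attr in quasi_identifier) for row in rows
    )
    return all(count >= k for count in groups.values())


def satisfies_all(rows, quasi_identifiers, k):
    """Return True if the table is k-anonymous on every quasi-identifier."""
    return all(is_k_anonymous(rows, qi, k) for qi in quasi_identifiers)


# Hypothetical toy table: age and zip are already generalized, sex is not.
table = [
    {"age": "20-29", "zip": "476**", "sex": "F"},
    {"age": "20-29", "zip": "476**", "sex": "M"},
    {"age": "30-39", "zip": "479**", "sex": "F"},
    {"age": "30-39", "zip": "479**", "sex": "F"},
]

# Two hypothetical users, each with a different quasi-identifier.
quasi_identifiers = [("age", "zip"), ("zip", "sex")]

# Generalizing sex to "*" in every row (an illustrative choice, not a
# step taken from the paper) makes the single table satisfy both users.
generalized = [{**row, "sex": "*"} for row in table]

print(is_k_anonymous(table, ("age", "zip"), 2))         # True
print(is_k_anonymous(table, ("zip", "sex"), 2))         # False
print(satisfies_all(generalized, quasi_identifiers, 2)) # True
```

In this toy case the table is 2-anonymous for the first user but not for the second, so one released table must be generalized further before it serves both. The cost of that extra generalization is the information loss the abstract aims to keep small.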
Similar resources
TrPLS: Preserving Privacy in Trajectory Data Publishing by Personalized Local Suppression
Trajectory data are becoming more popular due to the rapid development of mobile devices and the widespread use of location-based services. They often provide useful information that can be used for data mining tasks. However, a trajectory database may contain sensitive attributes, such as disease, job, and salary, which are associated with trajectory data. Hence, improper publishing of the tra...
Cryptanalysis of Basic Bloom Filters Used for Privacy Preserving Record Linkage
Linking databases containing information on individual characteristics and behavior is of increasing scientific and commercial interest. In many applications, linking databases has to be done without a unique personal number. Hence, due to privacy concerns, privacy preserving record linkage (PPRL) is used most often. In this context encrypted personal quasi-identifiers such as first names, surn...
SLOMS: A Privacy Preserving Data Publishing Method for Multiple Sensitive Attributes Microdata
Multi-dimension bucketization is a typical method to anonymize multiple sensitive attributes. However, the method leads to low data utility when the microdata have more sensitive attributes. In addition, the method does not generalize quasi-identifiers, which leaves the anonymized data vulnerable to linking attacks. To address these problems, the paper proposes a method called SLOMS. The method verti...
Protecting the Publishing Identity in Multiple Tuples
Current privacy preserving methods in data publishing always remove the individually identifying attribute first and then generalize the quasi-identifier attributes. They cannot take the individually identifying attribute into account. In fact, tuples will become vulnerable in the situation of multiple tuples per individual. In this paper, we analyze the individually identifying attribute in th...
Privacy-Preserving Publishing Frequent Sequential Patterns
Releasing frequent sequential patterns can compromise individual privacy of underlying sequences. We propose two concrete objectives as a potential standard for privacy-preserving publishing sequential patterns: k-anonymity and α-dissociation. The first one, extended from k-anonymity model for data, addresses the problem of inferring patterns with very low support, say, in [1, k) where k is an ...